Code: Monte-Carlo Simulation of Solar Wind Uncertainty and its Effect on Polar Cap Potential Saturation

by Nithin Sivadas
Note: Please do not share without Nithin's explicit permission.
For submission to the Journal Science.
Update: 7th October 2021; 27th Sep 2021; 20th Aug 2021; 4th July 2021
Table of Contents

Introduction

The cross polar cap potential saturates at strong solar wind forcing. In particular, the combined polar cap index (PCC) - a proxy for the polar cap potential, saturates at high values of the solar wind "merging" electric field (). There are at least 9 models that attempts to explain this non-linear relationship. All of these models propose a physical mechanism that reduces the day-side reconnection rate or the potential in the ionosphere, for a given solar wind driving (Borovsky 2009, Table A1). However, we propose an alternate explanation: the uncertainty in solar wind measurements.
Through this code, we demonstrate that the saturation of the polar cap potential is a result of our uncertainty in
  1. the propagation delay of solar wind from L1 to the bow shock,
  2. the propagation of the day-side electric field from the magnetopause to the polar cap ionosphere
  3. the random variation in the magnitude of the solar wind electric field as it propagates from L1 into the polar cap (intrinsic scatter)
The propagation delay of the effect of the solar wind forcing from L1, all the way to the polar cap ionosphere, is stochastic. And as a result, there is a small but random error in our best estimates of the propagation delay. Additionally, there could be a random (zero-mean) variability in the magnitude of the solar wind geoeffective field as it propagates to the ionosphere, which could result in additional uncertainty in our best estimates of the geoeffective field mapped to the polar cap. The uncertainty in our estimate of the geoeffective field mapped to the polar cap, varies with the magnitude of the field. Combining this property with the fact that the underlying probability distribution of the geoeffective field is a log-normal distribution, results in a biased estimate of solar wind field at the polar cap.
In the following code, we develop a stochastic model (a monte-carlo simulation) of the relationship between our estimate of the solar wind merging electric field mapped to the polar cap ( ) and its hypothetical true value (X). There is reason to believe that mapping the geoeffective field accurately to the polar cap leads to the combined polar cap index () (Stauning 2012). We use this assumption only to infer that the underlying probability distribution and autocorrelation function of X is similar to PCC. Furthermore, we constrain the error in , to be not more than the difference in . Through our simulation, we show that our estimate of the solar wind merging electric field in the polar cap , is a biased estimate of X, by making informed assumptions of merely the stochastic nature of X and our uncertainty in the propagation delay and intrinsic scatter.
The most important result is that our bias, varies non-linearly with X. The reason for this non-linear bias in the estimate is primarily due to the misidentification of far more lower values of X as high values. And we find that X saturates with our estimate , in the same way as the polar cap potential saturates with our estimate of . Firstly, this shows that our assumption, that an accurate mapping of to the polar cap, will lead it to be exactly linear with PCC (with a slope), is consistent with the model. And if the stochastic model's relationship between X and is true, and the uncertainties in propagation delay and intrinsic scatter are true, then the non-linearity between polar cap potential and solar wind forcing will follow logically from it.
In other words, given our uncertainty in the solar wind propagation from L1 to the polar cap, we should expect the exact statistical non-linear relationship we observe between the polar cap potential and solar wind merging electric field. Finally, we show that the same model predicts almost exactly the non-linear behavior between auroral electrojet current strength (SML) and the merging electric field (Em), provided we assume that the currents linearly increase with the true polar-cap potential.

Extract Data

Input Data Files

%Specify the folder where all the data is stored. See: https://github.com/nithinsivadas/uncertainty/tree/main/Data
DataDir = 'C:\Users\nithin\Documents\GitHub\uncertainty\Data\';
superMagFile = [DataDir,'20210927-20-45-supermag.txt'];
From SuperMag (JHU/APL)
windDataFolder = [DataDir,'Data\omni_components\'];
From OmniWeb WIND Spacecraft measurement (NASA)
polarCapDataFolder = [DataDir,'pc_index\PCI\'];
From https://pcindex.org/archive (Arctic and Antarctic Research Institute, and the Technical University of Denmark)

Output Folder

%Specify the output folder where you'd like to store the images
outputFolder = 'G:\My Drive\Research\Projects\Paper 6\Draft\Version 3\Science\Figures\New1\';
[wind, pci, Fsml, Xm, XmTime, Fpcc, acf_fit_Xm, acf_fit_PCC, acf_fit_var]= initialize(superMagFile,windDataFolder,polarCapDataFolder);
PCC = Fpcc(datenum(XmTime));
Defining data we use
For solar wind geoeffective field values, we calculate that using the time-shifted WIND measurements from the OMNI2 database.
For the polar cap potential, we use the polar cap index data set.
Xm - The solar wind geoeffective field, derived from OMNI measurements. , in [mV/m]
PCC - Is the combined polar cap potential, in [mV/m]. It is a measure of the geoeffective field observed in the ionosphere.
We also estimate average autocorrelation for the above variables.
acf_fit_Xm - The autocorrelation function associated with the measurement Xm
acf_fit_PCC - The autocorrelation function associated with the measurement PCC
acf_fit_var - The autocorrelation function associated with

Toy Model

We develop a stochastic model,
where X corresponds to PCC, and corresponds to that is time shifted with a constant propagation delay from L1, to nose, to the ionosphere.
How we arrived at this relationship between and X, is described in the detailed diagram below. Here, t is the time of the measurement at L1.
propagation_delay_cartoon.png
Inspired from data, we assume input distribution and ACFs for X.
We assume the intrinsic scatter in X, as is mapped down to the polar cap ionosphere
We also assume an autocorrelation for the , inspired from the ACF observed in the difference
corresponds to variability caused by intrinsic scattering of geoeffective field as it is mapped down to the ionosphere. Sources of this scattering might be - the slowing of the solar wind as it travels from L1 to nose, the influence of ionoshperic conductance on the electric field. It is a heteroscedastic variability, i.e., the variance increases with increasing value of X. And we select a functional form for it, that matches with the variance we potentially observe in the data.
We also assume the uncertainty in propagation delays.
A step by step description and generation of the inputs and assumptions follow.

Inputs

Probability Distribution Function of X

The underlying probability distribution function of X is an input to the model. It is a lognormal distribution, chosen to match the the probability distribution of the polar cap electric field PCC.
set(0,'defaulttextInterpreter','latex');
f1 = figure;
resize_figure(f1,2*55,2*55);
XBinsLog = logspace(-1,3,100);
histogram(Xm,XBinsLog,'Normalization','pdf','DisplayStyle','stairs','LineWidth',1,'EdgeColor','k');
hold on;
histogram(PCC,XBinsLog,'Normalization','pdf','DisplayStyle','stairs','LineWidth',1,'EdgeColor',[0 0.5 0]);
hold on;
plot(XBinsLog,lognpdf(XBinsLog,-0.2518,0.85),'m','LineWidth',1);
set(gca,'XScale','log','YScale','log');
ylim([10^-10,10]);
legend('Data: $\hat{E}^*_m$','Data: $\hat{E}_m = PCC$','Model: $\hat{X} = X_{PC}$','Interpreter','latex');
xlabel('[mV/m]');
ylabel('pdf');
title('Probability distributions of data and model','Interpreter',"none");

Autocorrelation Function of X and ϵ

The autocorrelation of the stochastic process, , that we are trying to model, is also an input to the model. We derive it from the autocorrelation of PCC. Similarly, the intrinsic scatter is also modelled as a stochastic process whose ACF is assumed to be the ACF of the error/variability .
f2 = figure;
resize_figure(f2,2*55,2*55);
tInd = 0:1000;
plot(tInd./60,acf_fit_Xm(tInd),'k');
hold on;
plot(tInd./60,acf_fit_PCC(tInd),'--k');
hold on;
plot(tInd./60,acf_fit_var(tInd),'Color',[0 0.5 0]);
legend('Data: ACF of $E^*_m$ at Nose','Data: ACF of $PCC$ at Ionosphere','Data: ACF of $E^*_m - PCC$','Interpreter','latex');
title('Autocorrelation Functions','From measurements','Interpreter',"none");
xlim([-1, 14]);
ylim([0,1]);
xlabel('$\tau$ [Hours]')
ylabel('$r_{xx}(\tau)$')
dt = 1; % 1 min intervals
nSamplesM = 2^12; % Upto 80 hours
nEnsemblesM =16000;
lag = fftshift(-nSamplesM:1:nSamplesM-1)'.*dt;
% Generating auto-correlation matrices for the model
Rm = acf_fit_PCC(abs(lag));
RmErr = acf_fit_var(abs(lag));
% Input 2 : The autocorrelation matrix/function
RmMatrix = toeplitz(Rm(find(lag==0):find(lag==(nSamplesM-1)*dt)));
RmErrMatrix = toeplitz(RmErr(find(lag==0):find(lag==(nSamplesM-1)*dt)));
% Input 1: Probability distribution
X = MvLogNRand(repmat(-0.2518,nSamplesM,1),repmat(0.85,nSamplesM,1),nEnsemblesM,RmMatrix); %% PCC Substitute in model

Assumptions

The different forms of variability (Figure S3,S4)

Propagation delay uncertainty

The solar wind geoeffective field takes time to propagate from L1 to Ionosphere, where is our estimate of the propagation time.
is our estaimte of the solar wind propagation from L1 to nose
is our estimate of the propagation of disturbances from the nose to the polar cap ionosphere (usually a constant 20 minutes)
Usually,
Where is the solar wind electric field measurement time. Where and are constant delays. However, these constant delay offsets are not the real story, since these propagation delays are stochastic.
As a result, we assume a distribution for the propagation delay uncertainties and , using help from the literature.
(Case and Wild, 2012)
(Stauning 2012, Section 5, Pg 369-372)
XBins = 0:1:40;
sigma1 = 8; %8 minutes
sigma2 = 25; %20 minutes
tsubset = 100:1:nSamplesM;
nSubsetSample = length(tsubset);
timeM = 1:1:nSamplesM; %% Time at Ionosphere
mean1 = 0;
mean2 = 20;
shape1 = 1.3;
shape2 = 1.3;
% L1 to Nose, propagation delay uncertainty
terr1 = (random('tlocationscale',mean1,sigma1,shape1,nSamplesM,nEnsemblesM))';
% Nose to Ionosphere, propagation delay uncertainty
terr2 = (random('Weibull',sigma2,shape2,nSamplesM,nEnsemblesM))'-mean2;
% Time delay for relationship between XI and WI
t_WI_XI = timeM + terr1 + terr2;
The distributions are shown in the plot below.
figS3 = figure;
resize_figure(figS3,2*55,2*55);
tbins = -100:1:100;
histogram(terr1,tbins,'Normalization',"pdf",'DisplayStyle','stairs','LineWidth',1,'EdgeColor','k');
hold on;
histogram(terr2,tbins,'Normalization',"pdf",'DisplayStyle','stairs','LineWidth',1,'EdgeColor','b');
hold on;
histogram(terr2 + terr1,tbins,'Normalization',"pdf",'DisplayStyle','stairs','LineWidth',1,'EdgeColor','m','LineStyle','--');
xlabel('$\tau$ [mins]');
ylabel('pdf($\tau$)');
legend('$dt_1$','$dt_2$','Total: $dt_1 + dt_2$ ','Interpreter','latex','Location','best');
title('Propagation delay uncertainty','Interpreter',"none");
ylim([0,0.05]);
export_fig([outputFolder,'FigureS3.png'], ...
'-r600','-png','-nocrop',figS3);
export_fig([outputFolder,'FigureS3.pdf'], ...
'-r600','-pdf','-nocrop',figS3);
Warning: export_fig currently supports transparent patches/areas only in PNG output.
To export transparent patches/areas to PDF, use the print command:
print(gcf, '-dpdf', 'G:\My Drive\Research\Projects\Paper 6\Draft\Version 3\Science\Figures\New1\FigureS3.pdf');

Figure S3: Propagation delay uncertainty

Other Uncertainty

We assume , to capture the random changes in the magnitude of the geoeffective field as it propagates from L1 to the ionosphere. And this scatter, is heteroscedastic in nature, i.e., it varies with the magnitude of X. The plot below shows the nature of this variation that we have chosen for the model. This is inspired from the variation between measured PCC and . Ultimately, most of the variability in the intrinsic scatter is a result of the fact that the magnetospheric state can vary based on differing conditions for the same solar wind electric field. The variation in the state can be a result of the ionospheric conductance, plasma composition in the magnetosphere, total energy in the magnetotail etc.
XBins = 0:1:40;
figS4 = figure;
resize_figure(figS4,2*55,2*55);
plot(0.5.*f(XBins,4,2500)./XBins,XBins,'m');
xlim([0,2]);
ylim([-0,40]);
xlabel('standard deviation of $\epsilon(t)$ normalized by $\hat{X}$');
ylabel('$\hat{X}$');
title('Other uncertainty: $\epsilon(t)$ varying with $\hat{X}$','Interpreter',"latex");
set(gca,'YTick',[0,12,40]);
export_fig([outputFolder,'FigureS4.png'], ...
'-r600','-png','-nocrop',figS4);
export_fig([outputFolder,'FigureS4.pdf'], ...
'-r600','-pdf','-nocrop',figS4);
Figure S4: Other uncertainty
is set to have a normal probability distirbution, whose standard deviation is a function of X shown by the above figure, with an autocorrelation function derived from the observed autocorrelation of .
HNoise = @(X,Sigma,nEnsembles,nSamples,RmMatrix) (f(X,4,2500)).*(MvNRand(repmat(0,nSamples,1),repmat(Sigma,nSamplesM,1),nEnsembles,RmMatrix));
HNoiseX = HNoise(X,0.5,nEnsemblesM,nSamplesM,RmErrMatrix);

Running the toy model

Here lies the primary equation, that represents the stochastic model developed in this work.
These variables correspond to the following variable names in the code.
WI ~
XI ~ X
HNoiseX ~
t_WI_XI ~
WI = zeros(nEnsemblesM,nSubsetSample); % Erroneous estimate at the ionosphere
XI = X(:,tsubset); % True value at the ionosphere
for i=1:1:nEnsemblesM
WI(i,:) = interp1(timeM',X(i,:)'+ HNoiseX(i,:)',t_WI_XI(i,tsubset)','nearest');
end

Output

Estimated polar cap potential

The primary output of the model is or "WI". This is our erroneous estimate of the polar cap electric field. It is estiamted by erroneously propagating the solar wind geoeffective field into the ionosphere, by assuming a constant delay, and no other source of variability. From this output, we can derive many other quantities of interest, that can ultimately explain the non-linearity we observe between and .

Result 1: Saturation of Polar Cap Potential Explained (Figure M2)

Xm2 = interp1(1:length(Xm),Xm',(1:length(Xm))-mean2,'nearest'); %Time-Shifting Em, to mimic the erroneous estimate at the polar cap Em*
EPCCgW = create_curve(Xm2',PCC,XBins);
PCCm = @(E,E_0) E./(sqrt(1+(E./E_0).^2));
EXIgWI = create_curve(WI(:),XI(:),XBins);
The conditional expectation of PCC: from data, matches stunningly with the conditional expectation from the model. As the model is based on only the stochastic properties of PCC, and the stochastic properties of the sources of uncertainty in propagation of to the ionospehere, this result implies that the observed non-linear saturation of the polar cap potential with geoeffective field can be explained by:
  1. the stochastic variability of propagation delay of the solar wind information from L1 to Ionosphere ~
  2. random scattering of the magnitude of the geoeffective field between L1 and Ionosphere ~
figM2=figure;
resize_figure(figM2,2*55,2*55);
p0 = plot(XBins,XBins,'-k');
hold on;
p4 = plot_curve(EPCCgW,[0 0.5 0]); % data
p4.LineStyle='--';
ylim([0,40]);
ylabel('Polar cap index response');
set(gca,'YColor',[0 0 0]);
p6 = plot_curve(EXIgWI,'m'); % model
ylim([0,40]);
xlim([0,40]);
xlabel({'$\hat{E}^*_m \ , \ \hat{X}^*$ [mV/m]','Erroneous solar wind reconnection electric field'});
legend([p4,p6,p0],...
'Data: $<E_{PC}|\hat{E}^{*}_{m}>$', ...
'Model: $<\hat{X}|\hat{X}^*>$',...
'45$^\circ$ line', ...
'Interpreter',"latex",'Location','best');
title('Saturation of Polar Cap Potential Explained','Interpreter',"none");
Figure 2: Our model reproduces the observed saturation of polar cap index extremely well.
export_fig([outputFolder,'FigureM2.png'], ...
'-r600','-png','-nocrop',figM2);
export_fig([outputFolder,'FigureM2.pdf'], ...
'-r600','-pdf','-nocrop',figM2);
Warning: export_fig currently supports transparent patches/areas only in PNG output.
To export transparent patches/areas to PDF, use the print command:
print(gcf, '-dpdf', 'G:\My Drive\Research\Projects\Paper 6\Draft\Version 3\Science\Figures\New1\FigureM2.pdf');

Validating the model

Normalized marginal error distribution (does not match)

By comparing how well the variability between PCC and is captured, by X and , we can validate the stochastic model.
; Normalized
Overall, the plot of the marginal error distribution below suggests that the model is still underestimating the uncertainty as compared to the data. One region of error it does not capture yet is the following.
  1. The lower X values values have far more uncertainty in observations than in the model. This could be because the assumed lognormal distribution for X, does not capture the frequency of X close to 0.
Eitherway, the plot demonstrates we do not at least overestimate the errors. That is our model is a conservative one, according to the marginal error distribution. However, the marginal error distributions do not tell us much about what is really going on.
f5=figure;
resize_figure(f5,2*55,2*55);
xBin3 = -5:0.1:5;
histogram((Xm2(:)-PCC(:))./PCC(:),xBin3,'Normalization','pdf', ...
'DisplayStyle','stairs','LineWidth',1,'EdgeColor','k');
hold on;
histogram((WI(:)-XI(:))./XI(:),xBin3,'Normalization','pdf', ...
'DisplayStyle','stairs','LineWidth',1,'EdgeColor','m');
xlabel('Normalized error $(\hat{X}^* - \hat{X})/\hat{X}$');
ylabel('pdf');
legend('Data','Model');
title('Normalized marginal error distribution','Interpreter',"none");

Normalized conditional error distributions of data and model match (Figure S7)

[PegX,XPegX,YPegX] = conditional_pdf((WI(:)-XI(:))./XI(:),XI(:),-2:0.1:2,0:1:20);
[Pe2gX,XPe2gX,YPe2gX] = conditional_pdf((Xm2(:)-PCC(:))./PCC(:),PCC(:),-2:0.1:2,0:1:20);
Below we plot a more complete picture of the uncertainties: the conditional probability density of the normalized error, , for both data and model, varying with If the variance (or spread) of this normalized conditional error distribution was constant with X, instead of decreasing linearly with X, it would mean that the "error" or variability is heteroscedastic as opposed to homoscedastic. Homoscedasticity is the general assumption when it comes to error distributions. As you see from the plot below, this is not a good assumption here, and we need the model to capture the heteroscedacity.
In our case, the spread of the normalized error distribution increases with X, defined approximately by displayed in a previous figure.
The conditional probability density of the normalized error (Left) is broadly captured by the model (Right).
figS7=figure;
resize_figure(figS7,2*55,2*120);
t=tiledlayout(1,2,'TileSpacing','loose');
title(t,'Comparing the Normalized Error Distribution between data and model');
nexttile
plot_2D_error(XPe2gX, YPe2gX, Pe2gX,'$P(\hat{E}^*_m-E_{PC}/E_{PC}|E_{PC})$');
Warning: Unable to set 'Position', 'InnerPosition', 'OuterPosition', or 'ActivePositionProperty' for objects in a TiledChartLayout
caxis([0,1]);
ylabel('$E_{PC}$ [mV/m]','Interpreter','latex');
xlabel('$(\hat{E}^*_{m}-E_{PC})/E_{PC}$','Interpreter','latex');
xlim([-2,2]);
title('Data');
nexttile
plot_2D_error(XPegX, YPegX, PegX,'$P(\hat{X}^*-\hat{X}/\hat{X}|\hat{X})$');
Warning: Unable to set 'Position', 'InnerPosition', 'OuterPosition', or 'ActivePositionProperty' for objects in a TiledChartLayout
caxis([0,1]);
xlim([-2,2]);
colormap(inferno);
title('Model');
ylabel('$X$','Interpreter','latex');
xlabel('$ (\hat{X}^* -\hat{X} )/\hat{X}$','Interpreter','latex');
export_fig([outputFolder,'FigureS7.png'], ...
'-r600','-png','-nocrop',figS7);
Figure S7: Conditional normalized error distribution

Probability distributions match (Figure S1, S5)

In the following figure, in panel 1, we compare and . As is an input to the model and inspired from , there is no surprise that they are similar.
However, in the second panel from data is shown with which is calculated by the model. The fact that they are very similar, should give us more confidence in the validity of our assumed stochastic model between and X. Notice the bump between 10 and 30 mV/m reproduced in the model.
figS1=figure;
XBinsLog = logspace(-1,2,100);
resize_figure(figS1,2*55,2*55);
nexttile
histogram(PCC,XBinsLog,'Normalization','pdf','DisplayStyle','stairs','LineWidth',1,'EdgeColor','k');
hold on;
histogram(XI,XBinsLog,'Normalization','pdf','DisplayStyle','stairs','LineWidth',1,'EdgeColor','m');
hold on;
set(gca,'XScale','log','YScale','log');
ylim([10^-10,100]);
xlim([10^-1,100]);
legend('Data: $pdf(E_{PC})$','Model input: $pdf( \hat{X} )$','Interpreter','latex')
xlabel('$E_{PC} \ , \ \hat{X}$ [mV/m]');
ylabel('$pdf$');
title('Comparing model input to data');
figS5=figure;
resize_figure(figS5,2*55,2*55);
histogram(Xm,XBinsLog,'Normalization','pdf','DisplayStyle','stairs','LineWidth',1,'EdgeColor','k');
hold on;
histogram(WI,XBinsLog,'Normalization','pdf','DisplayStyle','stairs','LineWidth',1,'EdgeColor','m');
set(gca,'XScale','log','YScale','log');
ylim([10^-10,100]);
xlim([10^-1,100]);
legend('Data: $ pdf(\hat{E}^*_m) $ ','Model: $ pdf(\hat{X}^*) $','Interpreter','latex');
xlabel('$ \hat{E}^*_m \ , \ \hat{X}^{*} $ [mV/m]');
ylabel('$pdf$');
title('Comparing model output to data');
export_fig([outputFolder,'FigureS1.png'], ...
'-r600','-png','-nocrop',figS1);
export_fig([outputFolder,'FigureS1.pdf'], ...
'-r600','-pdf','-nocrop',figS1);
Warning: export_fig currently supports transparent patches/areas only in PNG output.
To export transparent patches/areas to PDF, use the print command:
print(gcf, '-dpdf', 'G:\My Drive\Research\Projects\Paper 6\Draft\Version 3\Science\Figures\New1\FigureS1.pdf');
Figure S1: Input to the model has a similar pdf as the of , proxy for the true electric field .
export_fig([outputFolder,'FigureS5.png'], ...
'-r600','-png','-nocrop',figS5);
export_fig([outputFolder,'FigureS5.pdf'], ...
'-r600','-pdf','-nocrop',figS5);
Warning: export_fig currently supports transparent patches/areas only in PNG output.
To export transparent patches/areas to PDF, use the print command:
print(gcf, '-dpdf', 'G:\My Drive\Research\Projects\Paper 6\Draft\Version 3\Science\Figures\New1\FigureS5.pdf');
Figure S5: Output of the model matches with data as well.

Slope of conditional expectations match

The slope of the conditional expectations shown in the main result, demonstrates the nature of the non-linear relationship between PCC and . And it turns out the model predicts the behavior of this slope, varying with X, almost exactly as observed in the data, up to 30 mV/m, after which the data is too sparse to take the slopes seriously.
fig3=figure;
resize_figure(fig3,2*55,2*55);
plot(EPCCgW.XBins(1:end-1),smooth(diff(EPCCgW.YgX),10),'--k');
set(gca,'YColor','k');
ylabel('$\frac{d<PCC|\hat{E}^*_m>}{d\hat{E}^*_m}$');
ylim([-0.4,1]);
xlim([0,35]);
grid on
hold on;
plot(EXIgWI.XBins(1:end-1),smooth(diff(EXIgWI.YgX),10),'m');
ylim([-0.4,1]);
xlabel('$E_{PC}\ , \ X$');
xlim([0,35]);
legend('Data','Model','Location','best');
title({'Average slope of the conditional expectations','are very similar'},'Interpreter',"none");

Autocorrelation functions match

Finally, we compare the autocorrelation of the time series between the data and its corresponding counterparts in the model. Panel 1, compares ACF of PCC and X. But since the ACF of X is an input to the model inspired from PCC, there is no surprise in their similarity. However, in panel 2, the similarity between the ACF of and , which is an output of the model, is quite striking. And it again ought to increase our confidence in the validity of the stochastic model.
M = interp_nans(WI')';
[RArrayXN,lagXN] = find_correlation(M-nanmean(M,1),nSubsetSample,nEnsemblesM,1);
acf_XN = mean(RArrayXN(:,1:2^12)./RArrayXN(:,1));
lag_XN = lagXN(1:2^12);
acf_fit_XN = fit(lag_XN,acf_XN','spline'); % Fitting the ACF with a spline!
M = interp_nans(XI')';
[RArrayXI,lagXI] = find_correlation(M-nanmean(M,1),nSubsetSample,nEnsemblesM,1);
acf_XI = mean(RArrayXI(:,1:2^12)./RArrayXI(:,1));
lag_XI = lagXI(1:2^12);
acf_fit_XI = fit(lag_XI,acf_XI','spline'); % Fitting the ACF with a spline!
figS2=figure;
resize_figure(figS2,2*55,2*55);
plot(tInd./60,acf_fit_PCC(tInd),'--k');
hold on;
plot(tInd./60,acf_fit_XI(tInd),'-m');
ylim([0,1]);
xlim([-1,14]);
legend('Data: $<PCC(t) PCC(t + \tau)>$','Model: $<\hat{X}(t) \hat{X}(t+\tau)>$','Interpreter','latex');
title('Comparing ACFs of model input and data');
xlim([-1, 14]);
ylabel('Autocorrelation Functions');
xlabel('$\tau$ [Hours]');

ACF of model input is made to be similar to the data (Figure S2)

export_fig([outputFolder,'FigureS2.png'], ...
'-r600','-png','-nocrop',figS2);
export_fig([outputFolder,'FigureS2.pdf'], ...
'-r600','-pdf','-nocrop',figS2);
fig31=figure;
resize_figure(fig31,2*55,2*55);
tInd = 0:1000;
plot(tInd./60,acf_fit_Xm(tInd),'--k');
hold on;
plot(tInd./60,acf_fit_XN(tInd),'-m');
ylim([0,1]);
xlim([-1,14]);
xlabel('$\tau$ [Hours]');
ylabel('Autocorrelation Functions');
legend('Data: $<\hat{E}^*_m(t)\hat{E}^*_m(t + \tau)>$','Model: $<\hat{X}^*(t) \hat{X}^*(t+ \tau)>$','Interpreter','latex');
title('Comparing ACFs of model output and data');

Variation of variability (or error) match

Looking at some summary statistics of the normalized conditional error distribution, we can get a better picture of the model vailidity with respect to how well it captures the variation of variability/error with X. In the figure below, that provides three stunning similarities between model and data.
Panel 1: Spread of the normalized error distribution varies with X in the same manner for data and model. The model underestimates the spread in error for X values < 10 mV/m. This was noted earlier as a reason for why the marginal probability distributions of error don't match well, between data and model.
Panel 2 and Panel 3: Show the bias in the variability or error, and how it increases with The model captures this bias perfectly well. And this bias is a result of the uncertainty in the propagation delay (), and the heteroscedastic variability .
EeXIgWI = create_curve(WI(:),WI(:)-XI(:),XBins);
EeXIgWIs = create_curve(Xm2(:),Xm2(:)-PCC(:),XBins);
fig4=figure;
resize_figure(fig4,2*120,2*120);
t=tiledlayout(2,2,"TileSpacing","compact");
nexttile
yyaxis left
plot(EeXIgWIs.XBins,EeXIgWIs.stdYgX,'k');
ylim([0,15]);
ylabel('$\Sigma(\hat{E}^*_m-PCC)$');
set(gca,'YColor','k');
yyaxis right
plot(EeXIgWI.XBins,EeXIgWI.stdYgX,'m');
ylim([0,15]);
set(gca,'YColor','m');
ylabel('$ \Sigma(\hat{X}^*-\hat{X}) $');
title('Std. deviation of error','Spread of the error distribution','Interpreter',"none");
legend('Data','Model');
xlabel('$\hat{E}^*_m \ , \ \hat{X}^*$ [mV/m]');
nexttile
yyaxis left
plot(EeXIgWIs.XBins,100.*EeXIgWIs.stdYgX./EeXIgWIs.XBins,'k');
ylim([0,100]);
ylabel('$\Sigma(\hat{E}^*_m-PCC) / PCC$ [\%]');
set(gca,'YColor','k');
yyaxis right
plot(EeXIgWI.XBins,100.*EeXIgWI.stdYgX./EeXIgWI.XBins,'m');
ylim([0,100]);
set(gca,'YColor','m');
ylabel('$\Sigma(\hat{X}^*-\hat{X}) / \hat{X} $ [\%]');
title('Std. deviation of normalized error','Spread of the normalized error distribution','Interpreter',"none");
legend('Data','Model');
xlabel('$\hat{E}^*_m \ , \ \hat{X}^*$ [mV/m]');
nexttile
yyaxis left
plot(EeXIgWIs.XBins,100.*EeXIgWIs.YgX./EeXIgWIs.XBins,'k');
ylim([-20,100]);
ylabel('$mean(\hat{E}^*_m-PCC) / PCC$ [\%]');
set(gca,'YColor','k');
yyaxis right
plot(EeXIgWI.XBins,100.*EeXIgWI.YgX./EeXIgWI.XBins,'m');
ylim([-20,100]);
set(gca,'YColor','m');
ylabel('$mean(\hat{X}^*-\hat{X}) / \hat{X} $ [\%]');
title('Mean of normalized error','Normalized Bias','Interpreter',"none");
legend('Data','Model');
xlabel('$\hat{E}^*_m \ , \ \hat{X}^*$ [mV/m]');
nexttile
yyaxis left
plot(EeXIgWIs.XBins,EeXIgWIs.YgX,'k');
ylim([-5,40]);
ylabel('$mean(\hat{E}^*_m-PCC)$ [mV/m]');
set(gca,'YColor','k');
yyaxis right
plot(EeXIgWI.XBins,EeXIgWI.YgX,'m');
ylim([-5,40]);
set(gca,'YColor','m');
ylabel('$mean(\hat{X}^*-\hat{X})$ [mV/m]');
title('Mean of error','Bias','Interpreter',"none");
legend('Data','Model');
xlabel('$\hat{E}^*_m \ , \ \hat{X}^*$ [mV/m]');

Predicting nature of random error between estimate and true value (Figure S6)

figS6=figure;
resize_figure(figS6,2*55,2*55);
plot(EeXIgWIs.XBins,100.*EeXIgWIs.stdYgX./EeXIgWIs.XBins,'k');
ylim([0,100]);
ylabel('$\Sigma(\hat{E}^*_m-PCC) / PCC$ [\%]');
set(gca,'YColor','k');
hold on;
plot(EeXIgWI.XBins,100.*EeXIgWI.stdYgX./EeXIgWI.XBins,'m');
ylim([0,100]);
title('Std. deviation of normalized error','Spread of the normalized error distribution','Interpreter',"none");
legend('Data','Model');
xlabel('$\hat{E}^*_m \ , \ \hat{X}^*$ [mV/m]');
export_fig([outputFolder,'FigureS6.png'], ...
'-r600','-png','-nocrop',figS6);
export_fig([outputFolder,'FigureS6.pdf'], ...
'-r600','-pdf','-nocrop',figS6);

Non-linear conditional bias (Figure S8)

The model reproduces the non-linear conditional bias observed in the data extraordinarily well, by merely assuming random uncertainty in the estimate of the solar wind electric field at the polar cap.
figS8=figure;
resize_figure(figS8,2*55,2*55);
plot(EeXIgWIs.XBins,EeXIgWIs.YgX,'k');
ylim([-5,40]);
ylabel('$mean(\hat{E}^*_m-PCC)$ [mV/m]');
set(gca,'YColor','k');
hold on;
plot(EeXIgWI.XBins,EeXIgWI.YgX,'m');
ylim([-5,40]);
title('Mean of error','Bias','Interpreter',"none");
legend('Data','Model');
xlabel('$\hat{E}^*_m \ , \ \hat{X}^*$ [mV/m]');
export_fig([outputFolder,'FigureS8.png'], ...
'-r600','-png','-nocrop',figS8);
export_fig([outputFolder,'FigureS8.pdf'], ...
'-r600','-pdf','-nocrop',figS8);

Explaining the origin of the non-linear bias (Figure S9)

The non-linear increase in bias that you see in the above plots, is a result of the tranformation of X into , despite the zero-mean random nature of propagation delay uncertainity, and the intrinsic scatter.
Panel (Left): In the plot below, the left panel shows the probability distribution of X, colored in accordance with the value of X.
The stochstic model, transforms X into .
Panel (Right): The right panel shows the probabiliy distribution of , the transformed variable, but with the color still in accordance with the value of - the input to the model.
As you can see, a large proportional of small X values, are now associated with large values of . Naturally, this brings down the value of X given a particular closer to the mean. This explains the bias.
Why is the bias non-linear?
  1. The fact that there are far more values of X at lower mangitudes, than higher magnitudes, due to the underlying probability distribution of X being log-normal.
  2. And the heteroscedastic nature of the scattering of X (which implies a multiplicative error model, see Carroll et al., 2006, Sec 4.5.2, that can cause a regression attenuation (bias) that is non-linear)
The above two facts are the reasons behind the non-linearity in the observed bias in the conditional expectation and also .
nBins = length(XBinsLog);
[edges1, value1] = create_segmented_pdfs(XI,XI,XBinsLog,XBinsLog);
cf = inferno(nBins);
[edges2, value2] = create_segmented_pdfs(XI,WI,XBinsLog,XBinsLog);
figS9 = figure;
resize_figure(figS9,2*55,2*120);
t=tiledlayout(1,2,"TileSpacing","tight");
title(t,'Explaining the statistical origin of the non-linear bias','Interpreter','none','FontSize',16);
nexttile
bar(log10(edges1),value1','BarLayout','stacked','BarWidth',1,'EdgeColor','none');
set(gca,'XScale','linear','YScale','log',...
'XTick',log10([0.1,1,4,10,50]),'XTickLabel',{'0.1','1','4','10','50'});
colororder(cf);
xlabel('$\hat{X}$ [mV/m]');
ylabel('$pdf(\hat{X})$ [Count Density]');
title('Model input');
xlim([-1,2]);
ylim([10^-2,10^8]);
set(gca,'FontSize',14);
nexttile
bar(log10(edges2),value2','BarLayout','stacked','BarWidth',1,'EdgeColor','none');
set(gca,'XScale','linear','YScale','log',...
'XTick',log10([0.1,1,4,10,50]),'XTickLabel',{'0.1','1','4','10','50'}, ...
'YTickLabel',[]);
colororder(cf);
colormap(cf);
ylim([10^-2,10^8]);
xlim([-1,2]);
xlabel('$\hat{X}^*$ [mV/m]');
title({'Model output: proportion of','corresponding inputs in color'});
set(gca,'FontSize',14);
cb=colorbar_thin('YLabel','$\hat{X}$');
Warning: Unable to set 'Position', 'InnerPosition', 'OuterPosition', or 'ActivePositionProperty' for objects in a TiledChartLayout
caxis(log10([edges1(1)-0.01,edges1(end)]));
cb.Ticks=log10([0.1, 1, 4, 10, 50]);
cb.TickLabels=([0.1, 1, 4, 10, 50]);
export_fig([outputFolder,'FigureS9.png'], ...
'-r600','-png','-nocrop',figS9);
export_fig([outputFolder,'FigureS9.pdf'], ...
'-r600','-pdf','-nocrop',figS9);

Non-linear bias in the Conditional PDFs

The conditional probability distributions, also show the non-linear bias clearly, in the data and the model. The left panel of the figure below, shows the conditional probability distribution of derived from data, and the conditional pdf of from the model. The similarities are visible between the distributions.
[PXgW,XPXgW,YPXgW] = conditional_pdf(XI(:),WI(:),0:1:40,0:1:40);
[PXgWs,XPXgWs,YPXgWs] = conditional_pdf(PCC(:),Xm(:),0:1:40,0:1:40);
fig6=figure;
resize_figure(fig6,2*55,2*120);
t=tiledlayout(1,2);
title(t,'Non-linear bias in the Conditional Probability Densities','Interpreter','none');
nexttile
colormap(inferno);
plot_2D_error(XPXgWs, YPXgWs, PXgWs','$pdf(E_{PC}|\hat{E}^*_m)$');
Warning: Unable to set 'Position', 'InnerPosition', 'OuterPosition', or 'ActivePositionProperty' for objects in a TiledChartLayout
caxis([0,0.5]);
ylabel('$E_{PC}$','Interpreter','latex');
xlabel('$\hat{E}^*_m$');
hold on;
p3 = plot(XBins,XBins,'c');
hold on;
p4 = plot_curve(EPCCgW,'g');
xlim([0,40]);
ylim([0,40]);
legend([p3,p4],{'$<E_{PC}|\hat{E}_m>$','$<E_{PC}|\hat{E}^*_m>$'},'Location','best','Interpreter','latex');
title('Data: $pdf(E_{PC}|\hat{E}_m)$');
nexttile
colormap(inferno);
plot_2D_error(XPXgW, YPXgW, PXgW','$pdf(\hat{X}|\hat{X}^*)$');
Warning: Unable to set 'Position', 'InnerPosition', 'OuterPosition', or 'ActivePositionProperty' for objects in a TiledChartLayout
caxis([0,0.5]);
ylabel('$\hat{X}$','Interpreter','latex');
xlabel('$\hat{X}^*$');
hold on;
p3 = plot(XBins,XBins,'c');
hold on;
p4 = plot_curve(EXIgWI,'m');
xlim([0,40]);
ylim([0,40]);
legend([p3,p4],{'$<\hat{X}|\hat{X}>$','$<\hat{X}|\hat{X}^*>$'},'Location','best','Interpreter','latex');
title('Model: $pdf(\hat{X}|\hat{X}^*)$');

Normalized conditional error distribution, conditioned on X*

The error distirbution plots differ based on the variable it is conditioning on. Here you can see the non-linear bias, shown here.
[PegW,XPegW,YPegW] = conditional_pdf((WI(:)-XI(:))./XI(:),WI(:),-2:0.1:2,0:1:20);
[Pe2gW,XPe2gW,YPe2gW] = conditional_pdf((Xm2(:)-PCC(:))./PCC(:),Xm2(:),-2:0.1:2,0:1:20);
figA1=figure;
resize_figure(figA1,2*55,2*120);
t=tiledlayout(1,2);
title(t,'Comparing the Normalized Error Distribution between data and model');
nexttile
plot_2D_error(XPe2gW, YPe2gW, Pe2gW,'$P(\hat{E}^*_m-E_{PC}/E_{PC}|\hat{E}^*_m)$');
Warning: Unable to set 'Position', 'InnerPosition', 'OuterPosition', or 'ActivePositionProperty' for objects in a TiledChartLayout
caxis([0,1]);
ylabel('$\hat{E}^*_{m}$ [mV/m]','Interpreter','latex');
xlabel('$(\hat{E}^*_{m}-E_{PC})/E_{PC}$','Interpreter','latex');
xlim([-2,2]);
title('Data');
nexttile
plot_2D_error(XPegW, YPegW, PegW,'$P(\hat{X}^*-\hat{X}/\hat{X}|\hat{X}^*)$');
Warning: Unable to set 'Position', 'InnerPosition', 'OuterPosition', or 'ActivePositionProperty' for objects in a TiledChartLayout
caxis([0,1]);
xlim([-2,2]);
colormap(inferno);
title('Model');
ylabel('$\hat{X}^*$','Interpreter','latex');
xlabel('$ (\hat{X}^* -\hat{X} )/\hat{X}$','Interpreter','latex');

Result 2: Saturation of Auroral Currents Explained (as well)

For the sake of argument, if we assume, the auroral currents increase linearly with electric field in the ionosphere, such as , then the conditional expectation will non-linearly increase with . And this non-linearity, is exactly similar to what we observe from data.
In the plot below, the green-line, is the conditional expectation of the auroral electrojet strength given time-shifter solar wind geoeffective field: . This matches exactly with the model prediction , the magneta line, provided we make the above assumption of a linear relationship between Y (model proxy for SML) and X (model proxy for , effective ionospheric electric field).
This result implies, that the non-linear relationship between auroral electrojet current and geoeffective field, can be explained by our uncertainties in propagating geoeffective field from L1 measurements to the ionosphere - all the while the ionospheric current is a linear function of the effective ionospheric electric field or polar cap electric field.
YS = Fsml(datenum(XmTime));
EYsgWs = create_curve(Xm2(:),YS(:),XBins);
EYsgPCC = create_curve(PCC(:),YS(:),XBins);
Y = -120.*XI + random('Normal',0,5,nSubsetSample,nEnsemblesM)';
EYgW = create_curve(WI(:),Y(:),XBins);
EYgX = create_curve(XI(:),Y(:),XBins);
fig7 =figure;
resize_figure(fig7,2*55,2*55);
yyaxis right
p1 = plot_curve(EYgX,'k');
hold on;
p2 = plot_curve(EYgW,'m');
set(gca,'YColor','m');
p2.LineStyle='-';
ylabel('$<Y|\hat{X}^*>$');
ylim([-4000,0]);
yyaxis left
p3 = plot_curve(EYsgWs,[0 0.5 0]);
p3.LineStyle='--';
set(gca,'YColor',[0 0.5 0]);
ylim([-4000,0]);
xlabel('$\hat{E}^*_m \ , \ \hat{X}^*$ [mV/m]','Interpreter','latex');
ylabel('$<SML|\hat{E}^*_m>$ [nT]','Interpreter','latex');
legend([p3,p2,p1],{'Data: $<SML|\hat{E}^*_m>$ ','Model: $<Y|\hat{X}^*>$ ','Model without error : $<Y|\hat{X}>$'},'Location','best','Interpreter','latex');
title('Saturation of Auroral Currents Explained','Interpreter',"none");
Warning: Unable to set 'Position', 'InnerPosition', 'OuterPosition', or 'ActivePositionProperty' for objects in a TiledChartLayout
Warning: Unable to set 'Position', 'InnerPosition', 'OuterPosition', or 'ActivePositionProperty' for objects in a TiledChartLayout

Correcting for the bias: Regression calibration

When we have the conditional expectation of the true value given an erroneous estiamte such as , such as what we obtained in result 1 and result 2, these can be considered the best estimate of the true values given the erroroneous estimate. can be used to statistically correct to its most likely estimate of the true value . Hence, the input solar wind forcing variable (X-axis in result 1 and 2) can be resplaced with the statistically corrected variable . This is simply . We carry this out by interpolating the best estimates of the model true values given model erroneous values when .
% Calibrating Em
Xm2C=interp1(EXIgWI.XBins,EXIgWI.YgX,Xm2(:),'linear'); % Regression calibration; Stretching the X-axis according to the regression function <X|X*>
[PXgWCs,XPXgWCs,YPXgWCs] = conditional_pdf(PCC(:),Xm2C(:),0:1:40,0:1:40);
EPCCgWC = create_curve(Xm2C,PCC,XBins);
figA5=figure;
resize_figure(figA5,2*55,2*120);
t=tiledlayout(1,2);
nexttile
colormap(inferno);
PXgWs(isinf(PXgWs)|PXgWs>100)=0;
plot_2D_error(XPXgWs, YPXgWs, PXgWs','$pdf(E_{PC}|\hat{E}^*_m)$');
Warning: Unable to set 'Position', 'InnerPosition', 'OuterPosition', or 'ActivePositionProperty' for objects in a TiledChartLayout
caxis([0,0.5]);
ylabel('$E_{PC}','Interpreter','latex');
xlabel('$\hat{E}^*_m$');
hold on;
p3 = plot(XBins,XBins,'c');
hold on;
p4 = plot_curve(EPCCgW,'g');
xlim([0,40]);
ylim([0,40]);
legend([p3,p4],{'$<E_{PC}|\hat{E}_m>$','$<E_{PC}|\hat{E}^*_m>$'},'Location','best','Interpreter','latex');
title('Data: $pdf(E_{PC}|\hat{E}^*_m)$');
nexttile
colormap(inferno);
PXgWCs(isinf(PXgWCs)|PXgWCs>100)=0;
plot_2D_error(XPXgWCs, YPXgWCs, PXgWCs','$pdf(E_{PC}|\hat{E}^c_m)$');
Warning: Unable to set 'Position', 'InnerPosition', 'OuterPosition', or 'ActivePositionProperty' for objects in a TiledChartLayout
caxis([0,0.5]);
ylabel('$E_{PC}$','Interpreter','latex');
xlabel('$\hat{E}^c_m$');
hold on;
p3 = plot(XBins,XBins,'c');
hold on;
p4 = plot_curve(EPCCgWC,'g');
xlim([0,40]);
ylim([0,40]);
legend([p3,p4],{'$<E_{PC}|\hat{E}_m>$','Calibratied $<E_{PC}|\hat{E}^c_m>$'},'Location','best','Interpreter','latex');
title('Calibrated Data: $pdf(E_{PC}|\hat{E}^c_m)$');
Warning: Error updating Text.

String scalar or character vector must have valid interpreter syntax:
$E_{PC}

Corrected polar cap potential response and auroral electrojet response (Figure M3)

The corrected values, now demonstrate a linear correlation between solar wind forcing, and polar cap potential and auroral electrojet response!
EYsgEmC = create_curve(Xm2C(:),YS(:),XBins);
figM3 =figure;
resize_figure(figM3,2*55,2*120);
t=tiledlayout(1,2,"TileSpacing","compact");
title(t,'Correcting error in solar wind measurement reveals:','Interpreter','none')
nexttile
p1 = plot(XBins,XBins,'k');
hold on;
p2 = plot_curve(EPCCgWC,'m');
set(gca,'YColor','m');
p2.LineStyle='-';
ylabel('$<Y|\hat{X}^*>$');
ylim([0,40]);
p3 = plot_curve(EPCCgW,[0 0.5 0]);
p3.LineStyle='--';
set(gca,'YColor',[0 0 0],'YTick',[0,10,20,30,40]);
xlabel('$\hat{E}^*_m \ , \ \hat{E}^c_m $[mV/m]','Interpreter','latex');
ylabel('Polar cap potential $E_{PC}$ [mV/m]','Interpreter','latex');
legend([p3,p2,p1],{'Erroneous Data: $<E_{PC}|\hat{E}^*_m>$ ','Calibrated Data: $<E_{PC}|\hat{E}^c_m>$ ','True linear relationship'},'Location','best','Interpreter','latex');
title('A) Linearity in polar cap potential response','Interpreter',"none");
nexttile
p1 = plot_curve(EYgX,'k');
hold on;
p2 = plot_curve(EYsgEmC,'m');
set(gca,'YColor','m');
p2.LineStyle='-';
ylabel('$<Y|\hat{X}^*>$');
ylim([-4000,0]);
p3 = plot_curve(EYsgWs,[0 0.5 0]);
p3.LineStyle='--';
set(gca,'YColor',[0 0 0],'YTick',[-4000,-3000,-2000,-1000,0]);
xlabel('$\hat{E}^*_m \ , \ \hat{E}^c_m$ [mV/m]','Interpreter','latex');
ylabel('Auroral electrojet strength $SML$ [nT]','Interpreter','latex');
legend([p3,p2,p1],{'Erroneous Data: $<SML|\hat{E}^*_m>$ ','Calibrated Data: $<SML|\hat{E}^c_m>$ ','True linear relationship'},'Location','best','Interpreter','latex');
title('B) Linearity in auroral current response','Interpreter',"none");
export_fig([outputFolder,'FigureM3.png'], ...
'-r600','-png','-nocrop',figM3);
export_fig([outputFolder,'FigureM3.pdf'], ...
'-r600','-pdf','-nocrop',figM3);
Warning: export_fig currently supports transparent patches/areas only in PNG output.
To export transparent patches/areas to PDF, use the print command:
print(gcf, '-dpdf', 'G:\My Drive\Research\Projects\Paper 6\Draft\Version 3\Science\Figures\New1\FigureM3.pdf');

Functions

function [y] = f(x,b,c)
f(): - capturing the heteroscedastic nature of intrinsic scatter in the propagation of at L1 to the ionosphere.
y = c.*((x/40).^b);
y(x>12) = c.*((12/40).^b);
end
function [wind, pci, Fsml, Em, time, Fpcc, acf_fit_E, acf_fit_PCC, acf_fit_var]= initialize(superMagFile,windDataFolder,polarCapDataFolder)
Initialize() : Extracts all relevant data.
% Extract SML values
superMag = extract_superMag(superMagFile);
Fsml = griddedInterpolant(datenum(superMag.datetime),superMag.SML,'linear','none');
% Extract WIND measurements, especially Em values
wind = extract_set_of_data(windDataFolder,'wind');
% Extract polar cap index
pci = extract_set_of_data_PCI(polarCapDataFolder);
pci.PCN(pci.PCN>=999)=nan;
pci.PCS(pci.PCS>=999)=nan;
temp.PCN=pci.PCN;
temp.PCS=pci.PCS;
temp.PCN(temp.PCN<=0)=0;
temp.PCS(temp.PCS<=0)=0;
pci.PCC = 0.5.*(temp.PCN + temp.PCS); % Calculating PCC from PCN and PCS
Fpcc = griddedInterpolant(datenum(pci.datetime),pci.PCC,'linear','none');
%Extracting dataset when both PCI and Wind measurements are available
[~,piI,wiI]=intersect(pci.datetime,wind.datetime); % Simultaneous time from WIND, GEOTAIL and PCI
wind2 = wind(wiI,:);
Em = wind2.E_kl.*10^-3; % How is EKL calculated is important
time = wind2.datetime; % Common time
PCC = Fpcc(datenum(time));
error = Em-PCC;
% Finding ACF of X, from data; %Explain how ACF is calculated in reference
% Calculating ACF of Xm
[Me, nSample, nEnsemble] = split_series(wind.E_kl.*10^-3,2^13);
[RArraye,lage] = find_correlation(Me-nanmean(Me,1),nSample,nEnsemble,1);
acf_E = mean(RArraye(:,1:2^12)./RArraye(:,1));
lag_E = lage(1:2^12);
acf_fit_E = fit(lag_E,acf_E','spline'); % Fitting the ACF with a spline!
% Calculating ACF of PCC
[Me1, nSample1, nEnsemble1] = split_series(pci.PCC,2^13);
[RArraye1,lage1] = find_correlation(Me1-nanmean(Me1,1),nSample1,nEnsemble1,1);
acf_PCC = mean(RArraye1(:,1:2^12)./RArraye1(:,1));
lag_PCC = lage1(1:2^12);
acf_fit_PCC = fit(lag_PCC,acf_PCC','spline'); % Fitting the ACF with a spline!
% Calculating ACF of Error
[Me2, nSample2, nEnsemble2] = split_series(error,2^13);
[RArraye2,lage2] = find_correlation(Me2-nanmean(Me2,1),nSample2,nEnsemble2,1);
acf_var = mean(RArraye2(:,1:2^12)./RArraye2(:,1));
lag_var = lage2(1:2^12);
acf_fit_var = fit(lag_var,acf_var','spline'); % Fitting the ACF with a spline!
end
function [edges, value] = create_segmented_pdfs(X,Y,XBins,XSegments,normalizationStr)
create_segmented_pdfs(): Creating histograms of probability distributions, which can be colored to show a form of conditional probability distribution. (See figure).
if nargin<5
normalizationStr = 'countdensity';
end
value = zeros(length(XSegments),length(XBins)-1);
for i=1:1:length(XSegments)
if i==1
[value(i,:),edges] = histcounts(Y(X<=XSegments(i)),XBins,'Normalization',normalizationStr);
edges = edges(2:end) - (edges(2)-edges(1))/2;
elseif i>1 && i < length(XSegments)
value(i,:) = histcounts(Y(X>XSegments(i-1) & X<=XSegments(i)),XBins,'Normalization',normalizationStr);
else
value(i,:) = histcounts(Y(X>=XSegments(i-1)),XBins,'Normalization',normalizationStr);
end
end
end
function p = plot_curve(curve, color)
CI1 = interp_nans(curve.CI);
p=plot(curve.XBins, curve.YgX, 'Color', color);
hold on;
plot_ci(curve.XBins,CI1,color,0.2);
end
function plot_ci(x,ci,color,alpha)
hold on;
X2 = [x, fliplr(x)];
inBetween = [ci(:,1)', fliplr(ci(:,2)')];
fill(X2,inBetween,color,'LineStyle','none','FaceAlpha',alpha);
end
function p = plot_2D_error(Y,X,P,yLabel)
p = pcolor(Y,X,P);
set(p,'EdgeColor','none');
colorbar_thin('YLabel',yLabel);
end
function curve = create_curve(X, Y, Ei)
if nargin<3
Ei = 100;
end
Y = Y(:);
Y(Y==999999)=nan;
X1 = X(~isnan(X) & ~isnan(Y));
Y1 = Y(~isnan(X) & ~isnan(Y));
X = X1;
Y = Y1;
[xindx, E] = discretize(X(:),Ei);
for i = 1:max(xindx)
curve.YgX(i) = nanmean(Y(xindx==i));
curve.stdYgX(i) = nanstd(Y(xindx==i));
curve.NSamples(i) = sum(xindx==i & ~isnan(Y));
curve.SEM(i) = nanstd(Y(xindx==i))./sqrt(curve.NSamples(i));
curve.ts(i,:) = tinv([0.025 0.975],curve.NSamples(i)-1);
curve.CI(i,:) = curve.YgX(i) + curve.ts(i,:)*curve.SEM(i);
curve.XBins(i) = 0.5*(E(i)+E(i+1));
end
curve.E = E;
end
function [fXgY,XX,YY] = conditional_pdf(X, Y, gridx, gridy)
X1 = X(~isnan(X) & ~isnan(Y));
Y1 = Y(~isnan(X) & ~isnan(Y));
X = X1;
Y = Y1;
sz = 2^10;
Y = Y(:);
Y(Y==999999)=nan;
[bandwidth,fXY,XX,YY]=kde2d([X,Y],sz,[min(gridx),min(gridy)],[max(gridx),max(gridy)]);
[fY, YY1] = ksdensity(Y,YY(:,1));
fXgY = fXY./repmat(fY,1,sz);
The above formula is basically equation 1, below.
Wiki_conditional_densities.PNG
end
function [M, nSample, nEnsemble]= split_series(series,sampleSize,MaxNoOfMissingValues)
if nargin<3
MaxNoOfMissingValues=1000;
end
series = padarray(series,sampleSize-mod(length(series),sampleSize),'post');
L = length(series);
M0 = reshape(series,sampleSize,[])';
indx=sum(isnan(M0),2)>MaxNoOfMissingValues;
l = 1:1:size(M0,1);
k=1;
for i=l(~indx)
M(k,:) = M0(i,:);
k=k+1;
end
nSample = size(M,2);
nEnsemble = size(M,1);
M = interp_nans(M')';
end
function [RArray,lag]=find_correlation(M,nSample,nEnsemble,sampleTime)
dt = sampleTime;
lag = fftshift(-nSample:1:nSample-1)'.*dt;
xCell = mat2cell(M, ones(1,nEnsemble),nSample);
[RArray] = cellfun(@(x) xcorr(x,'unbiased'),xCell,'UniformOutput',false);
RArray = cell2mat(RArray);
RArray(:,2:end+1) = RArray(:,1:end);
RArray(:,1)=RArray(:,2);
RArray = ifftshift(RArray);
end
function T1 = extract_data(loadFile)
% Extract solar wind data from OMNI High Resolution space-craft specific files
format = '%4f %4f %3f %3f %4f %4f %4.1f %6f %6.2f %6.2f %6.2f %8.2f %8.2f %8.2f %8.2f %8.2f %8.2f %7f %6.2f %8.2f %8.2f %4f %8.1f %8.1f %8.1f %8.1f %7.2f %9.0f %8.2f %8.2f %8.2f %8.2f %8.2f %8.2f %8.2f %7f %7f';
tempData = load_ascii_files(loadFile, format, 0);
T.datetime = datetime(tempData{1},1,tempData{2},tempData{3},tempData{4},zeros(size(tempData{4})));
T.timeshift = tempData{8}; T.timeshift(T.timeshift==999999) = nan;
T.BxGSM = tempData{13}; T.BxGSM(T.BxGSM > 9999) = nan;
T.ByGSM = tempData{16}; T.ByGSM(T.ByGSM > 9999) = nan;
T.BzGSM = tempData{17}; T.BzGSM(T.BzGSM > 9999) = nan;
T.velocity = tempData{23}; T.velocity(T.velocity > 99999) = nan;
T.B_T = (T.ByGSM.^2 + T.BzGSM.^2).^0.5;
T.theta_kl = wrapTo2Pi(atan2(T.B_T,T.BzGSM));
T.E_kl = T.velocity.*T.B_T.*(sin(T.theta_kl/2)).^2;
T.E_sw = T.velocity.*T.BzGSM;
T1 = struct2table(T);
end
function T1 = extract_superMag(loadFile)
% Extract auroral electrojet strength data from SuperMag database
% files
format = '%4f %2f %2f %2f %2f %2f %7.6f %7.6f';
tempData = load_ascii_files(loadFile, format, 105);
T.datetime = datetime(tempData{1},tempData{2},tempData{3},tempData{4},tempData{5},tempData{6});
T.SML = tempData{7};
T.SML(T.SML==999999)=nan;
T1 = struct2table(T);
end
function T = extract_set_of_data(loadFolder,satellite)
fileStr = get_files_in_folder(loadFolder,[satellite,'_min_b*.txt']);
for i=1:1:length(fileStr)
T1 = extract_data([loadFolder,fileStr{i}]);
if i==1
T=T1;
else
T = [T;T1];
end
end
end
function T1 = extract_PCI(loadFile)
% Extracting Polar Cap Index values from the PC Index data files
format = '%4f-%2f-%2f %2f:%2f %6.2f %6.2f';
tempData = load_ascii_files(loadFile, format, 1);
T.datetime = datetime(tempData{1},tempData{2},tempData{3},tempData{4},tempData{5},zeros(size(tempData{4})));
T.PCN = tempData{6};
T.PCS = tempData{7};
T1 = struct2table(T);
end
function T = extract_set_of_data_PCI(loadFolder)
fileStr = get_files_in_folder(loadFolder,['pcnpcs*.txt']);
for i=1:1:length(fileStr)
T1 = extract_PCI([loadFolder,fileStr{i}]);
if i==1
T=T1;
else
T = [T;T1];
end
end
end
function data=load_ascii_files(loadFile, format, headerlines)
if nargin<3
headerlines = 1;
end
fileID = fopen(loadFile,'r');
data = textscan(fileID, format, 'headerlines', headerlines);
fclose(fileID);
end

External Functions

Acknowledging the use of two external functions:
1) Stephen Lienhard (2021). Multivariate Lognormal Simulation with Correlation (https://www.mathworks.com/matlabcentral/fileexchange/6426-multivariate-lognormal-simulation-with-correlation), MATLAB Central File Exchange. Retrieved October 7, 2021.
2) Zdravko Botev (2021). kernel density estimation (https://www.mathworks.com/matlabcentral/fileexchange/17204-kernel-density-estimation), MATLAB Central File Exchange. Retrieved October 7, 2021.

Stochastic process from Log-normal distribution

MVLOGNRAND MultiVariant Lognormal random numbers with correlation
function y = MvLogNRand( Mu , Sigma , Simulations , CorrMat )
%MVLOGNRAND MultiVariant Lognormal random numbers with correlation
%
% Mu: The Lognormal parameter Mu (can be column or row vector)
%
% Sigma: The Lognormal parameter Sigma (can be column or row vector)
%
% Simulations: The Number of simulations to run (scalar)
%
% CorrMat: OPTIONAL A square matrix with the number of rows and columns
% equal to the number of elements in Mu/Sigma. Each element on the
% diagonal is equal to one, with the off diagonal cells equal to the
% correlation of the marginal Lognormal distributions. If not specified,
% then assume zero correlation.
%
% To check the simulation run corrcoef(Y) and that should be the same as
% your CorrMat.
%
% REQUIRES THE STATISTICS TOOLBOX
%
% Example:
% Mu = [ 11 12 13 ];
% Sigma = [ .1 .3 .5 ];
% Simulations = 1e6;
% CorrMat = [1 .2 .4 ; .2 1 .5 ; .4 .5 1];
% y = MvLogNRand( Mu , Sigma , Simulations , CorrMat );
%
% corrcoef(y)
% ans =
% 1 0.19927 0.40156
% 0.19927 1 0.50008
% 0.40156 0.50008 1
%
% CorrMat =
% 1 0.2 0.4
% 0.2 1 0.5
% 0.4 0.5 1
%
% For more information see: Aggregration of Correlated Risk Portfolios:
% Models and Algorithms; Shaun S. Wang, Phd. Casualty Actuarial Society
% Proceedings Volume LXXXV www.casact.org
%
% Author: Stephen Lienhard
% Error checking
if nargin < 3
error('Must have at least 3 input arguements')
end
if numel(Simulations) ~= 1 || Simulations < 0
error('The number of simulations must be greater then zero and a scalar')
end
if nargin == 3
CorrMat = eye(numel(Mu));
elseif size(CorrMat,1) ~= size(CorrMat,2)
error('The correlation matrix must have the same number of rows as columns')
end
if numel(Mu) ~= numel(Sigma)
error('Mu and Sigma must have the same number of elements')
end
% Force column vectors
Mu = Mu(:);
Sigma = Sigma(:);
% Calculate the covariance structure
sigma_down = repmat( Sigma' , numel(Sigma), 1 );
sigma_acrs = repmat( Sigma , 1 , numel(Sigma) );
covv = log( CorrMat .* sqrt(exp(sigma_down.^2)-1) .* ...
sqrt(exp(sigma_acrs.^2)-1) + 1 );
% The Simulation
y = exp( mvnrnd( Mu , covv , Simulations ));
end
function y = MvNRand( Mu , Sigma , Simulations , CorrMat )
%MVLOGNRAND MultiVariant Lognormal random numbers with correlation
%
% Mu: The Lognormal parameter Mu (can be column or row vector)
%
% Sigma: The Lognormal parameter Sigma (can be column or row vector)
%
% Simulations: The Number of simulations to run (scalar)
%
% CorrMat: OPTIONAL A square matrix with the number of rows and columns
% equal to the number of elements in Mu/Sigma. Each element on the
% diagonal is equal to one, with the off diagonal cells equal to the
% correlation of the marginal Lognormal distributions. If not specified,
% then assume zero correlation.
%
% To check the simulation run corrcoef(Y) and that should be the same as
% your CorrMat.
%
% REQUIRES THE STATISTICS TOOLBOX
%
% Example:
% Mu = [ 11 12 13 ];
% Sigma = [ .1 .3 .5 ];
% Simulations = 1e6;
% CorrMat = [1 .2 .4 ; .2 1 .5 ; .4 .5 1];
% y = MvLogNRand( Mu , Sigma , Simulations , CorrMat );
%
% corrcoef(y)
% ans =
% 1 0.19927 0.40156
% 0.19927 1 0.50008
% 0.40156 0.50008 1
%
% CorrMat =
% 1 0.2 0.4
% 0.2 1 0.5
% 0.4 0.5 1
%
% For more information see: Aggregration of Correlated Risk Portfolios:
% Models and Algorithms; Shaun S. Wang, Phd. Casualty Actuarial Society
% Proceedings Volume LXXXV www.casact.org
%
% Author: Stephen Lienhard
% Error checking
if nargin < 3
error('Must have at least 3 input arguements')
end
if numel(Simulations) ~= 1 || Simulations < 0
error('The number of simulations must be greater then zero and a scalar')
end
if nargin == 3
CorrMat = eye(numel(Mu));
elseif size(CorrMat,1) ~= size(CorrMat,2)
error('The correlation matrix must have the same number of rows as columns')
end
if numel(Mu) ~= numel(Sigma)
error('Mu and Sigma must have the same number of elements')
end
% Force column vectors
Mu = Mu(:);
Sigma = Sigma(:);
% Calculate the covariance structure
sigma_down = repmat( Sigma' , numel(Sigma), 1 );
sigma_acrs = repmat( Sigma , 1 , numel(Sigma) );
covv = log( CorrMat .* sqrt(exp(sigma_down.^2)-1) .* ...
sqrt(exp(sigma_acrs.^2)-1) + 1 );
% The Simulation
y = mvnrnd( Mu , covv , Simulations );
end
function y = MvtRand( Mu , Nu , Simulations , CorrMat )
%MVLOGNRAND MultiVariant Lognormal random numbers with correlation
%
% Mu: The Lognormal parameter Mu (can be column or row vector)
%
% Sigma: The Lognormal parameter Sigma (can be column or row vector)
%
% Simulations: The Number of simulations to run (scalar)
%
% CorrMat: OPTIONAL A square matrix with the number of rows and columns
% equal to the number of elements in Mu/Sigma. Each element on the
% diagonal is equal to one, with the off diagonal cells equal to the
% correlation of the marginal Lognormal distributions. If not specified,
% then assume zero correlation.
%
% To check the simulation run corrcoef(Y) and that should be the same as
% your CorrMat.
%
% REQUIRES THE STATISTICS TOOLBOX
%
% Example:
% Mu = [ 11 12 13 ];
% Sigma = [ .1 .3 .5 ];
% Simulations = 1e6;
% CorrMat = [1 .2 .4 ; .2 1 .5 ; .4 .5 1];
% y = MvLogNRand( Mu , Sigma , Simulations , CorrMat );
%
% corrcoef(y)
% ans =
% 1 0.19927 0.40156
% 0.19927 1 0.50008
% 0.40156 0.50008 1
%
% CorrMat =
% 1 0.2 0.4
% 0.2 1 0.5
% 0.4 0.5 1
%
% For more information see: Aggregration of Correlated Risk Portfolios:
% Models and Algorithms; Shaun S. Wang, Phd. Casualty Actuarial Society
% Proceedings Volume LXXXV www.casact.org
%
% Author: Stephen Lienhard
% Error checking
if nargin < 3
error('Must have at least 3 input arguements')
end
if numel(Simulations) ~= 1 || Simulations < 0
error('The number of simulations must be greater then zero and a scalar')
end
if nargin == 3
CorrMat = eye(numel(Mu));
elseif size(CorrMat,1) ~= size(CorrMat,2)
error('The correlation matrix must have the same number of rows as columns')
end
% Force column vectors
Mu = Mu(:);
% Calculate the covariance structure
% sigma_down = repmat( Sigma' , numel(Sigma), 1 );
% sigma_acrs = repmat( Sigma , 1 , numel(Sigma) );
% covv = log( CorrMat .* sqrt(exp(sigma_down.^2)-1) .* ...
% sqrt(exp(sigma_acrs.^2)-1) + 1 );
% The Simulation
y = Mu' + mvtrnd(CorrMat, Nu, Simulations);
end

KDE2D

fast and accurate state-of-the-art bivariate kernel density estimator with diagonal bandwidth matrix.
function [bandwidth,density,X,Y]=kde2d(data,n,MIN_XY,MAX_XY)
% fast and accurate state-of-the-art
% bivariate kernel density estimator
% with diagonal bandwidth matrix.
% The kernel is assumed to be Gaussian.
% The two bandwidth parameters are
% chosen optimally without ever
% using/assuming a parametric model for the data or any "rules of thumb".
% Unlike many other procedures, this one
% is immune to accuracy failures in the estimation of
% multimodal densities with widely separated modes (see examples).
% INPUTS: data - an N by 2 array with continuous data
% n - size of the n by n grid over which the density is computed
% n has to be a power of 2, otherwise n=2^ceil(log2(n));
% the default value is 2^8;
% MIN_XY,MAX_XY- limits of the bounding box over which the density is computed;
% the format is:
% MIN_XY=[lower_Xlim,lower_Ylim]
% MAX_XY=[upper_Xlim,upper_Ylim].
% The dafault limits are computed as:
% MAX=max(data,[],1); MIN=min(data,[],1); Range=MAX-MIN;
% MAX_XY=MAX+Range/4; MIN_XY=MIN-Range/4;
% OUTPUT: bandwidth - a row vector with the two optimal
% bandwidths for a bivaroate Gaussian kernel;
% the format is:
% bandwidth=[bandwidth_X, bandwidth_Y];
% density - an n by n matrix containing the density values over the n by n grid;
% density is not computed unless the function is asked for such an output;
% X,Y - the meshgrid over which the variable "density" has been computed;
% the intended usage is as follows:
% surf(X,Y,density)
% Example (simple Gaussian mixture)
% clear all
% % generate a Gaussian mixture with distant modes
% data=[randn(500,2);
% randn(500,1)+3.5, randn(500,1);];
% % call the routine
% [bandwidth,density,X,Y]=kde2d(data);
% % plot the data and the density estimate
% contour3(X,Y,density,50), hold on
% plot(data(:,1),data(:,2),'r.','MarkerSize',5)
%
% Example (Gaussian mixture with distant modes):
%
% clear all
% % generate a Gaussian mixture with distant modes
% data=[randn(100,1), randn(100,1)/4;
% randn(100,1)+18, randn(100,1);
% randn(100,1)+15, randn(100,1)/2-18;];
% % call the routine
% [bandwidth,density,X,Y]=kde2d(data);
% % plot the data and the density estimate
% surf(X,Y,density,'LineStyle','none'), view([0,60])
% colormap hot, hold on, alpha(.8)
% set(gca, 'color', 'blue');
% plot(data(:,1),data(:,2),'w.','MarkerSize',5)
%
% Example (Sinusoidal density):
%
% clear all
% X=rand(1000,1); Y=sin(X*10*pi)+randn(size(X))/3; data=[X,Y];
% % apply routine
% [bandwidth,density,X,Y]=kde2d(data);
% % plot the data and the density estimate
% surf(X,Y,density,'LineStyle','none'), view([0,70])
% colormap hot, hold on, alpha(.8)
% set(gca, 'color', 'blue');
% plot(data(:,1),data(:,2),'w.','MarkerSize',5)
%
% Reference:
% Kernel density estimation via diffusion
% Z. I. Botev, J. F. Grotowski, and D. P. Kroese (2010)
% Annals of Statistics, Volume 38, Number 5, pages 2916-2957.
global N A2 I
if nargin<2
n=2^8;
end
n=2^ceil(log2(n)); % round up n to the next power of 2;
N=size(data,1);
if nargin<3
MAX=max(data,[],1); MIN=min(data,[],1); Range=MAX-MIN;
MAX_XY=MAX+Range/2; MIN_XY=MIN-Range/2;
end
scaling=MAX_XY-MIN_XY;
if N<=size(data,2)
error('data has to be an N by 2 array where each row represents a two dimensional observation')
end
transformed_data=(data-repmat(MIN_XY,N,1))./repmat(scaling,N,1);
%bin the data uniformly using regular grid;
initial_data=ndhist(transformed_data,n);
% discrete cosine transform of initial data
a= dct2d(initial_data);
% now compute the optimal bandwidth^2
I=(0:n-1).^2; A2=a.^2;
t_star=root(@(t)(t-evolve(t)),N);
p_02=func([0,2],t_star);p_20=func([2,0],t_star); p_11=func([1,1],t_star);
t_y=(p_02^(3/4)/(4*pi*N*p_20^(3/4)*(p_11+sqrt(p_20*p_02))))^(1/3);
t_x=(p_20^(3/4)/(4*pi*N*p_02^(3/4)*(p_11+sqrt(p_20*p_02))))^(1/3);
% smooth the discrete cosine transform of initial data using t_star
a_t=exp(-(0:n-1)'.^2*pi^2*t_x/2)*exp(-(0:n-1).^2*pi^2*t_y/2).*a;
% now apply the inverse discrete cosine transform
if nargout>1
density=idct2d(a_t)*(numel(a_t)/prod(scaling));
density(density<0)=eps; % remove any negative density values
[X,Y]=meshgrid(MIN_XY(1):scaling(1)/(n-1):MAX_XY(1),MIN_XY(2):scaling(2)/(n-1):MAX_XY(2));
end
bandwidth=sqrt([t_x,t_y]).*scaling;
end
%#######################################
function [out,time]=evolve(t)
global N
Sum_func = func([0,2],t) + func([2,0],t) + 2*func([1,1],t);
time=(2*pi*N*Sum_func)^(-1/3);
out=(t-time)/time;
end
%#######################################
function out=func(s,t)
global N
if sum(s)<=4
Sum_func=func([s(1)+1,s(2)],t)+func([s(1),s(2)+1],t); const=(1+1/2^(sum(s)+1))/3;
time=(-2*const*K(s(1))*K(s(2))/N/Sum_func)^(1/(2+sum(s)));
out=psi(s,time);
else
out=psi(s,t);
end
end
%#######################################
function out=psi(s,Time)
global I A2
% s is a vector
w=exp(-I*pi^2*Time).*[1,.5*ones(1,length(I)-1)];
wx=w.*(I.^s(1));
wy=w.*(I.^s(2));
out=(-1)^sum(s)*(wy*A2*wx')*pi^(2*sum(s));
end
%#######################################
function out=K(s)
out=(-1)^s*prod((1:2:2*s-1))/sqrt(2*pi);
end
%#######################################
function data=dct2d(data)
% computes the 2 dimensional discrete cosine transform of data
% data is an nd cube
[nrows,ncols]= size(data);
if nrows~=ncols
error('data is not a square array!')
end
% Compute weights to multiply DFT coefficients
w = [1;2*(exp(-i*(1:nrows-1)*pi/(2*nrows))).'];
weight=w(:,ones(1,ncols));
data=dct1d(dct1d(data)')';
function transform1d=dct1d(x)
% Re-order the elements of the columns of x
x = [ x(1:2:end,:); x(end:-2:2,:) ];
% Multiply FFT by weights:
transform1d = real(weight.* fft(x));
end
end
%#######################################
function data = idct2d(data)
% computes the 2 dimensional inverse discrete cosine transform
[nrows,ncols]=size(data);
% Compute wieghts
w = exp(i*(0:nrows-1)*pi/(2*nrows)).';
weights=w(:,ones(1,ncols));
data=idct1d(idct1d(data)');
function out=idct1d(x)
y = real(ifft(weights.*x));
out = zeros(nrows,ncols);
out(1:2:nrows,:) = y(1:nrows/2,:);
out(2:2:nrows,:) = y(nrows:-1:nrows/2+1,:);
end
end
%#######################################
function binned_data=ndhist(data,M)
% this function computes the histogram
% of an n-dimensional data set;
% 'data' is nrows by n columns
% M is the number of bins used in each dimension
% so that 'binned_data' is a hypercube with
% size length equal to M;
[nrows,ncols]=size(data);
bins=zeros(nrows,ncols);
for i=1:ncols
[dum,bins(:,i)] = histc(data(:,i),[0:1/M:1],1);
bins(:,i) = min(bins(:,i),M);
end
% Combine the vectors of 1D bin counts into a grid of nD bin
% counts.
binned_data = accumarray(bins(all(bins>0,2),:),1/nrows,M(ones(1,ncols)));
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function t=root(f,N)
% try to find smallest root whenever there is more than one
N=50*(N<=50)+1050*(N>=1050)+N*((N<1050)&(N>50));
tol=10^-12+0.01*(N-50)/1000;
flag=0;
while flag==0
try
t=fzero(f,[0,tol]);
flag=1;
catch
tol=min(tol*2,.1); % double search interval
end
if tol==.1 % if all else fails
t=fminbnd(@(x)abs(f(x)),0,.1); flag=1;
end
end
end